Federated Queries for Comparative Effectiveness Research: Performance Analysis
Abstract
This paper presents a study of the performance of federated queries implemented in a system that simulates the architecture proposed for the Scalable Architecture for Federated Translational Inquiries Network (SAFTINet). Performance tests were conducted using both physical hardware and virtual machines within the test laboratory of the Center for High Performance Computing at the University of Utah. Tests were performed on SAFTINet networks ranging from 4 to 32 nodes, with databases containing synthetic data for several million patients. The results show that the caGrid Federated Query Engine (FQE) is suitable for comparative effectiveness research (CER) federated queries, given its nearly linear scalability as the number of partner nodes increases. The results are also relevant to specifying the hardware required to run a CER grid.
Similar resources
A system to build distributed multivariate models and manage disparate data sharing policies: implementation in the scalable national network for effectiveness research
BACKGROUND Centralized and federated models for sharing data in research networks currently exist. To build multivariate data analysis for centralized networks, transfer of patient-level data to a central computation resource is necessary. The authors implemented distributed multivariate models for federated networks in which patient-level data is kept at each site and data exchange policies ar...
On the Role of the GRAPH Clause in the Performance of Federated SPARQL Queries
Federated SPARQL queries give unified answers from multiple and distributed SPARQL endpoints. A good example may be the collection of stops from different transport companies in the same city to create a route planning application. The performance of the evaluation of these types of queries is usually poor, a fact that makes difficult their use in real-life applications that need good performan...
Designing a Global Information Resource for Molecular Biology
Research in molecular biology is continuously producing an immense amount of data, but this information is spread over numerous heterogeneous data repositories. Their integration into a federated information system would drastically reduce the time a biologist has to spend browsing different WWW sites or databases in search for a particular piece of information. In this study we point out the s...
Comparative effectiveness research in DARTNet primary care practices: point of care data collection on hypoglycemia and over-the-counter and herbal use among patients diagnosed with diabetes.
BACKGROUND The Distributed Ambulatory Research in Therapeutics Network (DARTNet) is a federated network of electronic health record (EHR) data, designed as a platform for next-generation comparative effectiveness research in real-world settings. DARTNet links information from nonintegrated primary care clinics that use EHRs to deliver ambulatory care to overcome limitations with traditional obs...
Dynamic Join Order Optimization for SPARQL Endpoint Federation
The existing web of linked data inherently has distributed data sources. A federated SPARQL query system, which queries RDF data via multiple SPARQL endpoints, is expected to process queries on the basis of these distributed data sources. During a federated query, each data source may consist of a search space of nontrivial size. Therefore, finding the optimal join order to minimize the size of...
Journal: Studies in Health Technology and Informatics
Volume: 175
Publication date: 2012